Extraembryonic gut endoderm cells undergo programmed cell death during development

Despite a distinct developmental origin, extraembryonic cells in mice contribute to gut endoderm and converge to transcriptionally resemble their embryonic counterparts. Notably, all extraembryonic progenitors share a non-canonical epigenome, raising several pertinent questions, including whether this landscape is reset to match the embryonic regulation and if extraembryonic cells persist into later development. Here we developed a two-colour lineage-tracing strategy to track and isolate extraembryonic cells over time. We find that extraembryonic gut cells display substantial memory of their developmental origin including retention of the original DNA methylation landscape and resulting transcriptional signatures. Furthermore, we show that extraembryonic gut cells undergo programmed cell death and neighbouring embryonic cells clear their remnants via non-professional phagocytosis. By midgestation, we no longer detect extraembryonic cells in the wild-type gut, whereas they persist and differentiate further in p53-mutant embryos. Our study provides key insights into the molecular and developmental fate of extraembryonic cells inside the embryo.


Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested
A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated

Our web collection on statistics for biologists contains articles on many of the points above.

Software and code
Policy information about availability of computer code

Data collection
10x Genomics Cell Ranger (version 6.0.1), deMULTIplex (version 1.0.2), cutadapt (version 4.1), STAR (version 2.7.9a), stringtie (version 2.0.

For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Portfolio guidelines for submitting code & software for further information.
- Accession codes, unique identifiers, or web links for publicly available datasets
- A description of any restrictions on data availability
- For clinical datasets or third party data, please ensure that the statement adheres to our policy

Sequencing data sets generated within the scope of this study have been deposited in the Gene Expression Omnibus under accession no. GSE250084. scRNA-seq data sets of E8.75 gut endoderm and E9.5-E15.5 gastrointestinal tract were obtained from GSE123046 and GSE186525, respectively. WGBS data sets of wild type E6.5 epiblast were obtained from GSE137337. The mouse reference genome mm10 was obtained from UCSC (https://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/). Annotations of CpG islands for mm10 were downloaded from UCSC (https://genome.ucsc.edu/cgi-bin/hgTables). The mm10 gene annotation was downloaded from GENCODE (VM23, https://www.gencodegenes.org/mouse/release_M23.html). Source data are provided at https://doi.org/10.5281/zenodo.10926934. All other data supporting the findings of this study are available from the corresponding author on reasonable request.

Human research participants
Policy information about studies involving human research participants and Sex and Gender in Research.

Reporting on sex and gender
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences
Behavioural & social sciences
Ecological, evolutionary & environmental sciences

For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size
No statistical methods were used to predetermine sample sizes, but our sample sizes are similar to those reported in previous publications.

Data exclusions
Prior to downstream analysis and experiments, resorbing embryos were excluded. For the downstream experiments with the two-color lineage tracing, only embryos with gut-specific mCherry+ signal were used; mCherry+ only embryos were excluded. No other data were excluded.

Replication
For RNA-seq and RRBS experiments, three to four replicates were generated. For E9.5, embryos were pooled, while for E6.5 and E13.5, single-embryo replicates were generated. For WGBS experiments, two replicates were generated for each E6.5 tissue (exEndo 1 and 2) and one replicate was generated for each E9.5 tissue. For the E9.5 scRNA-seq analysis, one experiment was performed, which contained cells of different sort groups (dual+ low, intermediate and high populations, mCherry+ population) from 15 pooled embryos. For the E13.5 scRNA-seq analysis, four WT embryos and four Trp53 knockout embryos were included in the experimental set-up, labeled by MULTI-seq barcodes, which allowed comparison of cell state distributions across single-embryo replicates. For imaging experiments and FACS analysis, 3-10 embryos were analyzed; the exact number is indicated in the respective figure or legend. All attempts at replication were successful.
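The comparison of cell state distributions across barcoded single-embryo replicates mentioned above can be illustrated with a minimal Python sketch. It assumes each cell has already been assigned an embryo identifier (from its MULTI-seq barcode) and a cell state label; the function name, embryo identifiers and state labels below are hypothetical and do not come from the study's analysis code.

```python
from collections import Counter, defaultdict


def state_fractions(cells):
    """Return {embryo: {state: fraction of that embryo's cells}}.

    `cells` is an iterable of (embryo_id, cell_state) pairs, one per cell.
    Both values are hypothetical labels used only for illustration.
    """
    counts = defaultdict(Counter)
    for embryo_id, cell_state in cells:
        counts[embryo_id][cell_state] += 1

    fractions = {}
    for embryo_id, state_counts in counts.items():
        total = sum(state_counts.values())
        fractions[embryo_id] = {
            state: n / total for state, n in state_counts.items()
        }
    return fractions


# Toy input: four cells from one hypothetical WT embryo, two from a knockout.
example_cells = [
    ("WT_1", "state_A"), ("WT_1", "state_A"),
    ("WT_1", "state_A"), ("WT_1", "state_B"),
    ("KO_1", "state_A"), ("KO_1", "state_B"),
]
per_embryo = state_fractions(example_cells)
```

Normalising within each embryo keeps replicates with different cell numbers comparable.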

Randomization
For assessing the outcome of the complementation assays, embryos were collected without a preconceived selection strategy or prioritization by morphology. Our genomic analyses are independent of human intervention and analyze each sample equally and in an unbiased fashion.

Blinding
Data collection and analysis were not performed blind to the conditions of the experiments. Blinding was not relevant for this study since this is not an intervention study. However, our analytical pipeline followed uniform criteria applied to all samples, allowing us to analyse our data in an unbiased manner.

Wild animals
The study did not involve wild animals.

Reporting on sex
GFP+ embryos originating from the lineage tracing assay are male because the mESC line used in the complementation assay is male. mCherry+ pre-implantation embryos were generated via natural mating, resulting in both male and female cells.

Field-collected samples
No field-collected samples were used in this study.

Ethics oversight
All research described here complies with the relevant ethical regulations and was approved by the LAGeSo Berlin, license numbers G0243/18 and G0098/23.
Note that full information on the approval of the study protocol must also be provided in the manuscript.

Flow Cytometry Plots
Confirm that:
The axis labels state the marker and fluorochrome used (e.g. CD4-FITC).
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).
All plots are contour plots with outliers or pseudocolor plots.
A numerical value for number of cells or percentage (with statistics) is provided.

Methodology

Sample preparation
Deciduae were collected into ice-cold HBSS (Gibco #14175095). E9.5 embryos (somite number 18-28) were dissected in ice-cold M2 medium (Merck #MR-015-D); the extraembryonic tissues were completely removed, and the yolk sac was kept. For the single-cell RNA-seq analysis to determine the cell type identities of mCherry+ and dual+ cells, whole lineage-traced embryos were used. For assessing extraembryonic cell content in lineage-traced embryos (comparing wild type and p53 extraembryonic-specific knockout), the embryos were cut into two halves with a micro knife along the anterior-posterior axis, and the posterior half was used further. For RNA-seq, RRBS, and WGBS experiments, wild type lineage-traced E9.5 embryos were cut into two halves with a micro knife along the anterior-posterior axis. From the posterior half, the midgut was manually isolated using tungsten needles (Fine Science Tools #10130-10), and the most posterior part, containing the hindgut, was also kept. For each midgut and hindgut replicate, corresponding tissues from four embryos were pooled.

The embryos, the isolated tissues, and the yolk sac were washed in ice-cold HBSS and dissociated with 0.25% Trypsin-EDTA (Gibco #25200056) for 10 minutes at 37°C to obtain single cells. Trypsinization was quenched with KnockOut DMEM (Thermo Fisher Scientific #10829018) with 10% FBS (PAN-Biotech #P30-2602) and 0.05 mg/ml DNase I (Merck #11284932001), in which the cells were dissociated via pipetting; the cells were also washed once with this buffer. After blocking with Normal Mouse Serum (Invitrogen #31881) for 5 minutes on ice, cells were stained for EPCAM (Alexa Fluor® 647 anti-EPCAM, BioLegend #118212) in FACS buffer (HBSS with 2% FBS and 0.5 mM EDTA (Thermo Fisher Scientific #15575020)) for 10 minutes on ice. Specifically for the pooled midgut and hindgut samples, enrichment of EPCAM+ cells was performed by magnetic separation (MACS) using Anti-Cy5/Anti-Alexa Fluor 647 MicroBeads, following the manufacturer's instructions with the MS columns and the OctoMACS™ Separator. Last, cells were stained with DAPI (0.02%, Roche Diagnostics #102362760019) in FACS buffer for 8 minutes on ice, then washed once, resuspended in FACS buffer, and kept on ice until flow cytometry analyses or sorting.

E13.5 embryos were dissected in DMEM/F-12 (Thermo Fisher Scientific #21041025) with 10% FBS (PAN-Biotech #P30-2602). For scRNA-seq analysis of the wild type and p53 knockout embryos (see below), the gastrointestinal tract was isolated; dissociation, staining for EPCAM, enrichment by MACS, and preparation for FACS were then performed as described above for E9.5 midgut and hindgut samples. For RNA-seq and RRBS, E13.5 lineage-traced embryos were collected, and the intestine was isolated and split into the small intestine and colon parts with a micro knife. Then, dissociation, staining for EPCAM, and sample preparation for FACS were performed as described above for E9.5 embryos.

Cell population abundance
Given the low input for our sorting experiments, a sort check was not performed on the sorted material. Instead, separate samples of the same type were used to sort the desired populations and perform post-sort checks, which confirmed the purity of the sort-test sample.

Gating strategy
For analysis of extraembryonic gut cell content in embryos and organs, the following gating strategy was set up. First, an FSC-A vs SSC-A gating was used to identify the cell population. Next, two doublet removal steps were performed (FSC-W vs FSC-H and SSC-W vs SSC-H). Alive cells were gated based on DAPI, then epithelial/endoderm cells were gated based on EPCAM.

For sorting gut endoderm and yolk sac endoderm cells, the following strategy was used. First, an FSC-A vs SSC-A gating was used to identify the cell population. Alive cells were gated based on DAPI, then epithelial/endoderm cells were gated based on EPCAM. Next, two doublet removal steps were performed (FSC-W vs FSC-H and SSC-W vs SSC-H). Finally, cells were sorted based on GFP and mCherry intensities.

For sorting endoderm cells from the gastrointestinal tracts, the following strategy was used. First, an FSC-A vs SSC-A gating was used to identify the cell population. Alive cells were gated based on DAPI, then epithelial/endoderm cells were gated based on EPCAM. Next, two doublet removal steps were performed (FSC-W vs FSC-H

6), BSMAP (version 2.90), MOABS (version 1.3.2), GATK (version 4.3.0.0), BD FACS Diva (version 8.0.1), Zeiss ZEN Blue (version 3.5), ZEN 2014

Data analysis
R (version 4.1.0), Seurat (version 4.1.0), DESeq2 (version 1.32.0), WebGestaltR (version 0.4.
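The sorting gate sequence described under Gating strategy (scatter gate, DAPI viability gate, EPCAM gate, doublet removal, then GFP/mCherry classification) can be sketched as a chain of filters. This is only an illustration in Python: the thresholds are hypothetical placeholders in arbitrary units, the doublet gates are simplified to pulse-width cut-offs rather than gates drawn on width-versus-height plots, and real gates are set per experiment in the cytometer software.

```python
from dataclasses import dataclass


@dataclass
class Event:
    fsc_a: float    # forward scatter area
    ssc_a: float    # side scatter area
    fsc_w: float    # forward scatter width
    ssc_w: float    # side scatter width
    dapi: float     # DAPI signal (high in dead cells)
    epcam: float    # EPCAM-Alexa Fluor 647 signal
    gfp: float
    mcherry: float


# Hypothetical gate boundaries (arbitrary units), for illustration only.
FSC_A_RANGE = (20_000, 200_000)
SSC_A_RANGE = (5_000, 150_000)
MAX_FSC_W = 120
MAX_SSC_W = 120
DAPI_MAX = 1_000
EPCAM_MIN = 1_000
GFP_MIN = 500
MCHERRY_MIN = 500


def passes_gates(e: Event) -> bool:
    """Apply the scatter, viability, EPCAM and doublet gates in sequence."""
    is_cell = (FSC_A_RANGE[0] <= e.fsc_a <= FSC_A_RANGE[1]
               and SSC_A_RANGE[0] <= e.ssc_a <= SSC_A_RANGE[1])
    is_alive = e.dapi < DAPI_MAX
    is_epcam_pos = e.epcam >= EPCAM_MIN
    is_singlet = e.fsc_w <= MAX_FSC_W and e.ssc_w <= MAX_SSC_W
    return is_cell and is_alive and is_epcam_pos and is_singlet


def classify(e: Event) -> str:
    """Assign a sort population from GFP and mCherry intensities."""
    gfp_pos = e.gfp >= GFP_MIN
    mcherry_pos = e.mcherry >= MCHERRY_MIN
    if gfp_pos and mcherry_pos:
        return "dual+"
    if mcherry_pos:
        return "mCherry+"
    if gfp_pos:
        return "GFP+"
    return "negative"


def sort_populations(events) -> dict:
    """Count events per sort population after all upstream gates."""
    counts = {}
    for e in events:
        if passes_gates(e):
            label = classify(e)
            counts[label] = counts.get(label, 0) + 1
    return counts
```

Because each gate here is an independent yes/no test, the order of the upstream gates does not change which events reach the final classification; on an instrument the order mainly affects how each gate is drawn on its parent population.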